New Gradient-Spatial-Structural Features for video script identification

Authors

  • Palaiahnakote Shivakumara
  • Ze-Huan Yuan
  • Danni Zhao
  • Tong Lu
  • Chew Lim Tan
Abstract

Multi-script identification helps in automatically selecting an appropriate OCR engine when a video contains several scripts. Script identification in video frames is challenging, however, because the low resolution and complex backgrounds of video often cause disconnections or loss of text information. This paper presents a novel idea that integrates the Gradient-Spatial-Features (GSpF) and the Gradient-Structural-Features (GStF) at block level, based on an error factor and the weights of the features, to identify six video scripts: Arabic, Chinese, English, Japanese, Korean and Tamil. Horizontal and vertical gradient values are first computed for each text block to increase the contrast of text pixels. The method then divides the horizontal and the vertical gradient blocks into two equal parts at the centroid in the horizontal direction. A histogram operation is performed on each part to select dominant text pixels from the respective subparts of the horizontal and the vertical gradient blocks, which results in text components. After extracting GSpF and GStF from the text components, we finally propose to integrate the spatial and the structural features, based on end points, intersection points, junction points and straightness of the skeleton of text components, in a novel way to identify the scripts. The method is evaluated on 970 video frames of six scripts, which involve font, font size or contrast variations, and is compared with an existing method in terms of classification rate. Experimental results show that the proposed method achieves an 83.0% average classification rate for video script identification. The method is also evaluated on noisy images and scanned low-resolution documents, illustrating the robustness and the extensibility of the proposed gradient-spatial-structural features.

Keywords: Video text blocks, Gradient blocks, Dominant video text pixels, Gradient-spatial-structural features, Video script identification
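The first three steps described in the abstract (gradient computation, splitting at the centroid, histogram-based selection of dominant pixels) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the reading of the split as a cut at the centroid row, and the histogram rule (keep the top bins holding at most `keep_fraction` of the pixels) are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def gradients(block):
    """Return absolute (horizontal, vertical) gradient maps of a grayscale block."""
    block = block.astype(float)
    # np.gradient returns the derivative along rows first, then along columns.
    gy, gx = np.gradient(block)
    return np.abs(gx), np.abs(gy)

def split_at_centroid(grad):
    """Split a gradient map into upper and lower parts at its centroid row.

    Assumption: 'at the centroid in the horizontal direction' is read here
    as a horizontal cut through the gradient-weighted centroid row.
    """
    n_rows = grad.shape[0]
    total = grad.sum()
    if total == 0:
        cut = n_rows // 2
    else:
        rows = np.arange(n_rows)
        cut = int(round(float((grad.sum(axis=1) * rows).sum() / total)))
    cut = min(max(cut, 1), n_rows - 1)  # keep both parts non-empty
    return grad[:cut], grad[cut:]

def dominant_pixels(part, bins=16, keep_fraction=0.2):
    """Boolean mask of the strongest gradient pixels in one part.

    Assumption: the 'histogram operation' is approximated by keeping the
    top histogram bins that together hold at most keep_fraction of pixels.
    """
    if part.max() == 0:
        return np.zeros(part.shape, dtype=bool)
    counts, edges = np.histogram(part, bins=bins)
    cum_from_top = np.cumsum(counts[::-1])
    limit = keep_fraction * part.size
    n_bins = int(np.searchsorted(cum_from_top, limit, side="right"))
    n_bins = min(max(n_bins, 1), bins)
    threshold = edges[bins - n_bins]
    return part >= threshold
```

Applying `dominant_pixels` to each half of both gradient maps yields four masks whose connected regions play the role of the text components from which spatial and structural features would then be extracted.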


Similar resources

Identification of arabic word from bilingual text using character features

Identifying the language of a script is an important stage in the process of handwriting recognition. There are several works in this research area, which treat various languages. Most of the methods used are global or statistical. In the present paper, we study the possibility of using the features of scripts to identify the language. The identification of the language of the s...


Multilingual Artificial Text Extraction and Script Identification from Video Images

This work presents a system for extraction and script identification of multilingual artificial text appearing in video images. As opposed to most of the existing text extraction systems which target textual occurrences in a particular script or language, we have proposed a generic multilingual text extraction system that relies on a combination of unsupervised and supervised techniques. The un...


Script Identification in Natural Scene Image and Video Frame using Attention based Convolutional-LSTM Network

Script identification plays a significant role in analysing documents and videos. In this paper, we focus on the problem of script identification in scene text images and video scripts. Because of low image quality, complex backgrounds and the similar layout of characters shared by some scripts, such as Greek and Latin, text recognition in those cases becomes challenging. Most of the recent approaches...


Multi-script Off-line Signature Verification: A Two Stage Approach

Signature identification and verification are of great importance in authentication systems. The purpose of this paper is to introduce an experimental contribution in the direction of multi-script off-line signature identification and verification using a novel technique involving off-line English, Hindi (Devnagari) and Bangla (Bengali) signatures. In the first evaluation stage of the proposed ...


The influence of Kufic script on banna'i Kufic and its evolution up to today's logotype designs

Square or geometric Kufic (also known as banna’i) developed from the Kufic script and consists of repeating vertical, horizontal, and parallel geometric units on a geometric network called a grid. Due to its unique visual characteristics and its close association with architecture, it is very promising for use in contemporary arts. This paper presents research on contributing factors to...



Journal:
  • Computer Vision and Image Understanding

Volume 130

Publication date: 2015